Abstract

We show that the equality set $\mathrm{Eq}(g, h)$ of two non-periodic binary
morphisms $g, h : A^* \to \Sigma^*$ is generated by at most two words. If the rank
of $\mathrm{Eq}(g, h) = \{\alpha, \beta\}^*$ is two, then $\alpha$ and $\beta$ start (and end) with different
letters.

This in particular implies that any binary language has a test set of
cardinality at most two.

which is in some places excessively complicated and discouraging. On the other
hand, there are no new discoveries and the overall argument remains the same.

1 Introduction 

Binary equality language, i.e., the set on which two binary morphisms agree, 
is the most simple non-trivial example of an equality language, the notion of 
which was introduced in [SJ. Equality languages in general play an important 
role in formal language theory. For a survey and bibliography see [HI Section 5] . 

In the binary case, the morphisms are defined on a monoid generated by
two letters. This case was first studied extensively by K. Culik II and J.
Karhumäki in [3]. There, the main claim of our work was conjectured, viz. that
a binary equality language is generated by at most two words as soon as at least
one of the morphisms is non-periodic (or, equivalently, injective). An important
step towards the proof of the conjecture was made in [J], where the following
partial characterization was obtained.

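Theorem 1. The equality set of two binary morphisms $g, h : A^* \to \Sigma^*$, where
$A = \{a, b\}$, has the following structure:

(A) If $g$ and $h$ are periodic, then either $\mathrm{Eq}(g, h) = \{\varepsilon\}$ or

$$\mathrm{Eq}(g, h) = \{\varepsilon\} \cup \left\{ \alpha \in A^+ \;\middle|\; \frac{|\alpha|_a}{|\alpha|_b} = k \right\}$$

for some $k \geq 0$ or $k = \infty$.

(B) If exactly one morphism is periodic, then

$$\mathrm{Eq}(g, h) = \alpha^*$$

for some word $\alpha \in A^*$.

(C) If both $g$ and $h$ are non-periodic, then either

$$\mathrm{Eq}(g, h) = \{\alpha, \beta\}^*$$

for some words $\alpha, \beta \in A^*$, or

$$\mathrm{Eq}(g, h) = (\alpha \gamma^* \beta)^*$$

for some words $\alpha, \beta, \gamma \in A^+$.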
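The periodic case (A) of Theorem 1 can be checked by brute force on a small instance. The following Python sketch uses two hypothetical periodic morphisms (chosen for illustration, not taken from the paper) whose images are powers of the single letter a; for this pair the agreement condition reduces to the ratio $|w|_a / |w|_b = 2$.

```python
from itertools import product

def image(morphism, word):
    """Apply a morphism, given as a dict letter -> word, to a word."""
    return "".join(morphism[x] for x in word)

# Hypothetical periodic morphisms: every image is a power of t = "a".
g = {"a": "a", "b": "aaa"}
h = {"a": "aa", "b": "a"}

def in_equality_set(word):
    return image(g, word) == image(h, word)

def has_ratio_two(word):
    # |w|_a / |w|_b = 2, written without division
    return word.count("a") == 2 * word.count("b")

# All nonempty words over {a, b} of length at most 7.
words = ["".join(p) for n in range(1, 8) for p in product("ab", repeat=n)]
agree = all(in_equality_set(w) == has_ratio_two(w) for w in words)
```

Here `agree` records whether membership in the equality set coincides with the ratio condition on all words tested; the bound 7 is an arbitrary choice.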

The question remained open whether the second possibility of case (C),
contradicting the conjecture, can actually occur. In the present paper we show
that the answer is negative and, moreover, if $\alpha$ and $\beta$ are both nonempty, they
start (and end) with different letters. This is formulated in the following main
theorem.

Theorem 2. Let $g, h : A^* \to \Sigma^*$ be non-periodic binary morphisms. Let $\alpha$ and
$\beta$, with $\alpha \neq \beta$, be nonempty minimal elements of $\mathrm{Eq}(g, h)$. Then

$$\mathrm{pref}_1(\alpha) \neq \mathrm{pref}_1(\beta) \quad\text{and}\quad \mathrm{suff}_1(\alpha) \neq \mathrm{suff}_1(\beta).$$

As a trivial consequence we have a solution of the original question. 

Theorem 3. The equality language of two nonperiodic binary morphisms is
generated by at most two words.

I am not aware of any way to prove Theorem 3 without using Theorem 2.

Remark. Later, in [7], it was shown that the equality sets generated by
two words have a precise form. Namely, the following theorem holds true.

Theorem. Let $g$ and $h$ be distinct nonperiodic binary morphisms such that
$\mathrm{Eq}(g, h)$ is generated by two words. Then there is a positive integer $i$ such that

$$\mathrm{Eq}(g, h) = \{a^i b, b a^i\}^*,$$

up to renaming of the letters.

The proof is based on Theorem [5] 

A closely related problem is the size of a test set for binary languages. Indeed,
if two morphisms agree on a language, it must be a subset of their equality
language. In [3], it is shown that all binary languages have a three-element
test set. Our result allows us to cut this bound down to two. Let us remark
that this improvement is not a simple consequence of the fact that the equality
language is generated by two words; the difference in the first (or last) letter
is a necessary ingredient.

2 Preliminaries 

In this section, we fix our notation and recall some basic facts. For reference
and unproved claims see [2] or [5]. If $\Sigma$ is an alphabet, then $\Sigma^*$ denotes the
free monoid, and $\Sigma^+$ the free semigroup, generated by $\Sigma$. The empty word is
denoted by $\varepsilon$. Any subset of $\Sigma^*$ is called a language. Let $A$ denote the two-letter
alphabet $\{a, b\}$.

The length of a word $u$ is denoted by $|u|$, and $|u|_x$ denotes the number of
occurrences of the letter $x$ in $u$. A prefix of $u$ is any word $v \in \Sigma^*$ such that there
exists a word $v' \in \Sigma^*$ with $u = vv'$. The set of all prefixes of $u$ is denoted by
$\mathrm{pref}(u)$. A prefix $v$ of $u$ is proper if $v \neq \varepsilon$ and $v \neq u$. Similarly, suffix and proper
suffix are defined. The set of all suffixes of $u$ is denoted by $\mathrm{suff}(u)$. The first
(the last resp.) letter of a nonempty word $u$ is denoted by $\mathrm{pref}_1(u)$ ($\mathrm{suff}_1(u)$
resp.). A word $v$ is called a factor of $u$ if there exist words $w, w' \in \Sigma^*$ such that
$u = w v w'$.

If $v \in \mathrm{pref}(u)$ or $u \in \mathrm{pref}(v)$, then we say that $u$ and $v$ are prefix-comparable
(or simply comparable). The maximal common prefix of words $u$ and $v$ is
denoted by $u \wedge v$. If $u$ and $v$ are words, then the maximal $u$-prefix of $v$ is the
maximal prefix of $v$ that is also a prefix of $u^i$ for some $i$. Analogously, we define
the maximal $u$-suffix of $v$. We say that two words are suffix-comparable if one
is a suffix of the other.

Positive powers $u^n$ of a word are defined as usual, with $u^0 = \varepsilon$. We shall
sometimes also use negative powers and work with elements of the free group, in
order to simplify notation. This should not cause any confusion. For example,
if $u$ and $v$ are comparable, then we shall write $u^{-1} v$ even if $v$ is a proper prefix
of $u$. In such a case, when $u = vw$, we have $u^{-1} v = w^{-1}$.

We shall define regular languages by regular expressions in a standard way.
In particular, the language $\{u^i \mid i \geq 1\}$ is denoted by $u^+$ and $u^* = u^+ \cup \{\varepsilon\}$.
We say that $v$ is a prefix (suffix, factor resp.) of $u^+$ if $v$ is a prefix (suffix, factor
resp.) of $u^i$ for some $i \geq 1$.

A nonempty word $u$ is called primitive if and only if $u = v^n$ implies $u = v$.
The primitive root of a nonempty word $u$ is the (uniquely given) primitive word
$r$ such that $u \in r^+$. Words $u$ and $v$ are called conjugate if $u = ww'$ and $v = w'w$
for some words $w$ and $w'$.

If we speak about minimality or maximality of some element, the implicit
ordering is the prefix one, i.e., $v \leq u$ if and only if $v \in \mathrm{pref}(u)$, and $v < u$
if moreover $v \neq u$. (By the shortest word, on the other hand, we mean the word
with the smallest length.)

Let $u \in \Sigma^+$ be a word $u = l_1 l_2 \cdots l_d$, with $d = |u|$ and $l_i \in \Sigma$. Then the
reversal of the word $u$, denoted by $\overline{u}$, is obtained by inverting the order of the
letters, viz.

$$\overline{u} = l_d l_{d-1} \cdots l_1.$$

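Let $g$ be an arbitrary morphism. The reversal of $g$ is the morphism denoted by
$\overline{g}$, which has the same range and domain as $g$, and is defined by

$$\overline{g}(x) = \overline{g(x)},$$

for each $x \in \Sigma$. Note that in general $\overline{g}(u)$ is equal neither to $g(\overline{u})$ nor to $\overline{g(u)}$.
Instead

$$\overline{g}(\overline{u}) = \overline{g(u)}.$$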
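The interplay between the reversal of a word and the reversal of a morphism is easy to get wrong, so a small Python sketch may help. The morphism and the word below are hypothetical examples; the sketch only illustrates that reversing $g(u)$ amounts to applying the reversed morphism to the reversed word.

```python
def image(morphism, word):
    """Apply a morphism, given as a dict letter -> word, to a word."""
    return "".join(morphism[x] for x in word)

def reverse_word(word):
    return word[::-1]

def reverse_morphism(morphism):
    """Reversal of a morphism: the image of each letter is reversed."""
    return {x: reverse_word(v) for x, v in morphism.items()}

g = {"a": "ab", "b": "bba"}   # hypothetical example morphism
g_rev = reverse_morphism(g)
u = "aab"

lhs = reverse_word(image(g, u))        # reversal of g(u)
rhs = image(g_rev, reverse_word(u))    # reversed morphism on the reversed word
```

For this choice, neither `image(g_rev, u)` nor `image(g, reverse_word(u))` coincides with `lhs`, in line with the remark above.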

All concepts and arguments regarding prefixes are valid analogously for suffixes,
when reversals are considered. We shall often use this fact.

A morphism $g$ defined on $\Sigma^*$ is called erasing if $g(x)$ is empty for some $x \in \Sigma$.
A morphism $g$ is periodic if there is a word $t$ such that $g(x) \in t^*$, for all words
$x$ (or, equivalently, all letters $x$). Note that a binary morphism is periodic as
soon as it is erasing.

Let $S = T^+$ be a subsemigroup of $\Sigma^+$ generated by a set $T$. The rank of $T$
is the cardinality of the minimal set generating $S$. We can write

$$\mathrm{rank}(T) = \mathrm{rank}(S) = \mathrm{Card}(S \setminus S \cdot S).$$

By the rank of a monoid $M$ we mean the rank of the semigroup $M \setminus \{\varepsilon\}$.

It is a well known fact that for each set $M \subseteq \Sigma^+$ there exists the smallest
free subsemigroup of $\Sigma^+$ containing $M$, called its free hull. A set generating
a free semigroup is called a code. If any two distinct elements of a code are
neither prefix nor suffix comparable, the set is called a bifix code.

The equality set of two morphisms $g, h : A^* \to \Sigma^*$ is defined by

$$\mathrm{Eq}(g, h) = \{u \in A^* \mid g(u) = h(u)\}.$$

It is easy to verify that the set $\mathrm{Eq}(g, h)$ is a free submonoid of $A^*$ generated by
the set of its minimal elements

$$\mathrm{eq}(g, h) = \mathrm{Eq}(g, h) \setminus (\mathrm{Eq}(g, h) \setminus \{\varepsilon\})^2 \setminus \{\varepsilon\}.$$

Note that $\mathrm{eq}(g, h)$ is a bifix code.
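To make the definitions concrete, the following Python sketch enumerates the equality set up to a length bound and extracts its minimal elements. The two morphisms are a hypothetical nonperiodic pair constructed for illustration (they are not taken from the paper), and the bound 8 is arbitrary; the search is a bounded brute force, not a decision procedure.

```python
from itertools import product

def image(morphism, word):
    """Apply a morphism, given as a dict letter -> word, to a word."""
    return "".join(morphism[x] for x in word)

def equality_words(g, h, max_len):
    """Nonempty words of length <= max_len on which g and h agree."""
    found = set()
    for n in range(1, max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if image(g, w) == image(h, w):
                found.add(w)
    return found

def minimal_elements(words):
    """Words having no proper nonempty prefix in the same set."""
    return {w for w in words
            if not any(w[:i] in words for i in range(1, len(w)))}

# Hypothetical nonperiodic pair with g(ab) = h(ab) and g(ba) = h(ba).
g = {"a": "babab", "b": "a"}
h = {"a": "bab", "b": "aba"}
eq_bounded = minimal_elements(equality_words(g, h, 8))
```

For this pair the bounded search finds the minimal elements ab and ba, which start and end with different letters and have the form $\{a^i b, b a^i\}$ with $i = 1$.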

Let $g : A^* \to \Sigma^*$ be a nonperiodic binary morphism. By $z_g$ we denote the
maximal common prefix of $g(ab)$ and $g(ba)$, i.e.

$$z_g = g(ab) \wedge g(ba).$$

Since $g$ is nonperiodic, we have $|z_g| < |g(a)| + |g(b)|$ by Lemma 5 below. If
$\mathrm{pref}_1(g(a)) \neq \mathrm{pref}_1(g(b))$, i.e. $z_g = \varepsilon$, we say that $g$ is marked.

Similarly we define $\overline{z}_g$ as the maximal common suffix of $g(ab)$ and $g(ba)$.
Note that

$$\overline{z}_g = \overline{\overline{g}(ab) \wedge \overline{g}(ba)} = \overline{z_{\overline{g}}},$$

and $\overline{z}_g = \varepsilon$ is equivalent to $\overline{g}$ being marked.
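The words $z_g$ and its suffix counterpart are straightforward to compute. The sketch below is a minimal Python illustration on a hypothetical non-erasing morphism (not one from the paper); the helper names `z_prefix`, `z_suffix` and `is_marked` are chosen here for illustration only.

```python
import os

def image(morphism, word):
    """Apply a morphism, given as a dict letter -> word, to a word."""
    return "".join(morphism[x] for x in word)

def z_prefix(g):
    """Maximal common prefix of g(ab) and g(ba)."""
    return os.path.commonprefix([image(g, "ab"), image(g, "ba")])

def z_suffix(g):
    """Maximal common suffix of g(ab) and g(ba), via reversed images."""
    x, y = image(g, "ab")[::-1], image(g, "ba")[::-1]
    return os.path.commonprefix([x, y])[::-1]

def is_marked(g):
    """A non-erasing binary morphism is marked if its images start differently."""
    return g["a"][:1] != g["b"][:1]

h = {"a": "ab", "b": "aba"}           # hypothetical, not marked
identity = {"a": "a", "b": "b"}       # marked
```

Here `z_prefix(h)` is nonempty precisely because `h` is not marked, and its length stays below $|h(a)| + |h(b)|$ since `h` is nonperiodic.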

The Cartesian product $A^* \times A^*$ is the set of ordered pairs $(u, v)$ of words. It
can be seen as a monoid with the operation of catenation defined by $(u, v)(u', v') =
(uu', vv')$, with the unit $(\varepsilon, \varepsilon)$. Such a monoid is obviously not free; it is not
even isomorphic to a submonoid of a free monoid.

Let $g, h : A^* \to \Sigma^*$ be two morphisms. The subset of $A^* \times A^*$ denoted by
$C(g, h)$ and defined by

$$C(g, h) = \{(u, v) \mid g(u) = h(v)\}$$

will be called the coincidence set of morphisms $g$ and $h$. It is generated by the
set

$$c(g, h) = C(g, h) \setminus (C(g, h) \setminus \{(\varepsilon, \varepsilon)\})^2 \setminus \{(\varepsilon, \varepsilon)\}.$$

Any pair $(u, v) \in C(g, h)$ can be uniquely factorized into minimal pairs $(u_i, v_i)$
satisfying $g(u_i) = h(v_i)$. This is formulated in the following lemma.

Lemma 4. Let g and h be non-erasing morphisms. Then C(g, h) is, as a 
submonoid of A* x A*, freely generated by c(g,h). Moreover, the set c(g,h) is 
a bifix code. 

Note that $(u, u)$ is an element of $C(g, h)$ for each $u \in \mathrm{Eq}(g, h)$, and $\mathrm{Eq}(g, h)$
is given uniquely by $C(g, h)$ as

$$\mathrm{Eq}(g, h) = \{u \mid (u, u) \in C(g, h)\}.$$

We present several combinatorial lemmas for future (often implicit) reference.
The following three lemmas are part of the folklore.

Lemma 5. The words $u$ and $v$ commute if and only if they have the same
primitive root.

Lemma 6 (Periodicity Lemma). Let $u^+$ and $v^+$ have a common prefix of length
$|u| + |v|$. Then the words $u$ and $v$ commute.
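Both folklore lemmas are easy to test exhaustively on short words. The following Python sketch is only a sanity check on nonempty binary words of length at most 4 (an arbitrary bound), not a proof; the function names are chosen here for illustration.

```python
from itertools import product

def primitive_root(u):
    """Shortest word r such that the nonempty word u is a power of r."""
    n = len(u)
    for d in range(1, n + 1):
        if n % d == 0 and u[:d] * (n // d) == u:
            return u[:d]

def commute(u, v):
    return u + v == v + u

def periodicity_premise(u, v):
    """True if u^+ and v^+ share a common prefix of length |u| + |v|."""
    n = len(u) + len(v)
    return (u * n)[:n] == (v * n)[:n]

words = ["".join(p) for n in range(1, 5) for p in product("ab", repeat=n)]

# Lemma 5: commutation is equivalent to having the same primitive root.
lemma5_holds = all(
    commute(u, v) == (primitive_root(u) == primitive_root(v))
    for u in words for v in words
)
# Lemma 6: a common prefix of length |u| + |v| forces commutation.
lemma6_holds = all(
    commute(u, v) for u in words for v in words if periodicity_premise(u, v)
)
```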

We shall often use the following lemma. It is based on the well known fact
that a primitive word $t$ cannot satisfy the equality $tt = utv$ with $u$ and $v$ nonempty.

Lemma 7. (A) Let $ww = uwv$. Then $u$, $v$ and $w$ commute.

(B) Let $uw$ be a prefix of $w^+$. Then $u$ and $w$ commute.

(C) Let $sw$ be a factor of $w^+$. Then $s$ is a suffix of $w^+$.

(D) Let $uw$ be a suffix of $w^+$ and let $w$ be a prefix of $uw$. Then $u$ and $w$
commute.

(E) Let $u_1, u_2, w, w' \in \Sigma^+$ be words such that $w'$ and $w$ are conjugate,
$|u_1| < |u_2|$, and the words $u_1 w'$, $u_2 w'$ are prefixes of $w^+$. Then $u_1$ is a
suffix of $u_2$ and $u_2 u_1^{-1}$ commutes with $w$.

One more lemma, which is easy to prove:

Lemma 8. Let $g : A^* \to A^*$ be a marked morphism and let $u, v \in A^*$. Then
$g(u \wedge v) = g(u) \wedge g(v)$.

The following nice lemma is a key fact about binary morphisms.

Lemma 9. Let $X = \{x, y\} \subseteq \Sigma^+$ be a nonperiodic set (i.e. $xy \neq yx$). Let
$u \in xX^*$, $v \in yX^*$ be words such that $|u|, |v| > |xy \wedge yx|$. Then $u \wedge v = xy \wedge yx$.

The proof is not difficult (see [2], p. 348). The lemma immediately implies
that for a nonperiodic binary morphism $h$ and an arbitrary word $u \in A^+$ long
enough, the word $z_h$ is a prefix of $h(u)$ and the $(|z_h| + 1)$-th letter of $h(u)$
indicates the first letter of $u$. For any $u, v \in A^*$ we have

$$z_h = h(au) z_h \wedge h(bv) z_h. \tag{1}$$

It is now easy to see that the morphism $h_m$ such that

$$h_m(u) = z_h^{-1} h(u) z_h, \tag{2}$$

$u \in A$, is well defined. Moreover, it is marked, and the equality (2) holds for
any $u \in A^*$. We shall call it the marked version of $h$.
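The marked version can be computed letter by letter. The sketch below is a minimal Python illustration on a hypothetical nonperiodic morphism that is not marked; the function name `marked_version` and the test word are choices made here, not part of the paper.

```python
import os

def image(morphism, word):
    """Apply a morphism, given as a dict letter -> word, to a word."""
    return "".join(morphism[x] for x in word)

def marked_version(h):
    """Return (z_h, h_m) with h_m(x) = z_h^{-1} h(x) z_h for each letter x."""
    z = os.path.commonprefix([image(h, "ab"), image(h, "ba")])
    h_m = {}
    for x in "ab":
        conjugated = h[x] + z
        # z_h has to be a prefix of h(x) z_h for the left division to be defined
        assert conjugated.startswith(z)
        h_m[x] = conjugated[len(z):]
    return z, h_m

h = {"a": "a", "b": "aab"}            # hypothetical nonperiodic, not marked
z_h, h_m = marked_version(h)

# Equality (2) on a longer word, written as z_h h_m(w) = h(w) z_h.
w = "abba"
equality_two_holds = z_h + image(h_m, w) == image(h, w) + z_h
```

In this instance the images of `h_m` start with different letters, so `h_m` is marked as claimed.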

N.B. The case $g = h$ is trivial. Throughout the paper we shall implicitly
suppose $g \neq h$.

3 Principal morphisms



In this section we show that at least one of the morphisms g and h can be 
supposed to be marked. As we shall see, this will make our research more 
convenient. The goal is achieved by choosing a suitable target alphabet. 

Definition 10. We say that an (unordered) pair of binary morphisms $g, h :
A^* \to \Sigma^*$ is principal if the target alphabet $\Sigma$ generates the free hull of the set
$\{g(a), g(b), h(a), h(b)\}$.

The previous definition reflects the use of the term "principal morphism" in 
literature (see for example [5], p. 170). The advantages of principal morphisms 
stem from the following important property. 

Lemma 11. Let $X$ be a finite subset of $\Sigma^*$ and let $Y$ be the minimal generating
set of the free hull of $X$. Then for each element $y \in Y$ there is a word $x \in X$
such that $y$ is a prefix (suffix resp.) of $x$.

For the proof see pQ, Lemma 3.1. For our purpose, note the following im- 
mediate corollary. 

Corollary 12. Let $X$ be a finite subset of $\Sigma^*$ such that $\Sigma$ is the base of the free
hull of $X$. Then

$$\Sigma = \{\mathrm{pref}_1(u) \mid u \in X\} = \{\mathrm{suff}_1(u) \mid u \in X\}.$$

It is quite intuitive that choosing the minimal generating set of the free hull
as the target alphabet has no influence on the coincidence set of the morphisms.
The following lemma is formulated for binary morphisms, but it can be trivially
extended to any domain alphabet.

Lemma 13. Let $g_1, h_1$ be morphisms $A^* \to \Sigma^*$. Then there is a principal pair
of morphisms $g, h$ such that

$$C(g, h) = C(g_1, h_1).$$

Moreover, if $g_1$ ($h_1$, $\overline{g}_1$, $\overline{h}_1$ resp.) is marked, then so is $g$ ($h$, $\overline{g}$, $\overline{h}$ resp.).

Proof. Let $F \subseteq \Sigma^*$ be the free hull of the set $\{g_1(a), g_1(b), h_1(a), h_1(b)\}$ and let
$C$ be an alphabet whose cardinality equals the rank of $F$. Then $C^*$ and $F$ are
isomorphic since they are both free monoids of the same rank; let $\varphi : C^* \to F$
be an isomorphism. Define morphisms $g, h : A^* \to C^*$ by

$$g = \varphi^{-1} \circ g_1, \qquad h = \varphi^{-1} \circ h_1.$$

[Diagram: $g_1, h_1 : A^* \to F$, $g, h : A^* \to C^*$ and $\varphi : C^* \to F$.]

Then $(g, h)$ is a principal pair of morphisms, the above diagram commutes, and
$C(g, h) = C(g_1, h_1)$. The rest is obvious. □

The previous lemma shows that we can always, without loss of generality, 
suppose that the pair we work with is principal. We can now prove that this 
brings about markedness of one of the morphisms. 

Lemma 14. Let $g, h$ be nonperiodic principal morphisms, with $\mathrm{eq}(g, h)$
nonempty. Then at least one of the morphisms $g, h$ is marked, and at least one of
the morphisms $\overline{g}, \overline{h}$ is marked.

Proof. Suppose that neither of the morphisms is marked, therefore

$$\mathrm{pref}_1(g(a)) = \mathrm{pref}_1(g(b)), \qquad \mathrm{pref}_1(h(a)) = \mathrm{pref}_1(h(b)).$$

Let $x$ be the first letter of a word $u \in \mathrm{Eq}(g, h)$. Then

$$\mathrm{pref}_1(g(x)) = \mathrm{pref}_1(h(x)),$$

and Corollary 12 implies that the morphisms are periodic, a contradiction.

Obviously, the morphisms $\overline{g}, \overline{h}$ are also principal, since the concept of the
free hull is preserved under the reversal symmetry. This concludes the proof. □



4 The block structure of the coincidence set 

In this section, we study the structure of the equality set of nonperiodic
morphisms and its relation to the coincidence set. The previous section justifies
why we shall always suppose that $g$ is marked.

Let $u, v \in A^*$ be words such that $g(u)$ and $h(v)$ are comparable. Then the
word $h(v)^{-1} g(u)$ is called an overflow (the overflow may be a "negative" word if
$g(u)$ is a prefix of $h(v)$). The following lemmas show that the possibility to lengthen
the words $u, v$ to words $u', v'$ such that $g(u') = h(v')$ is very restricted. Namely,
the overflow $z_h$ is the only one admitting two different continuations.

Lemma 15. Let $g$ and $h$ be binary morphisms, and let $g$ be marked. Let $u, v \in
A^*$ be words such that $g(u)$ and $h(v)$ are comparable and let

$$g(u) \neq h(v) z_h.$$

Let $u_1, u_2, v_1, v_2 \in A^+$ be words such that

$$g(u u_1) = h(v v_1), \qquad g(u u_2) = h(v v_2).$$

Then

• $\mathrm{pref}_1(u_1) = \mathrm{pref}_1(u_2)$, if $|g(u)| - |h(v)| < |z_h|$;

• $\mathrm{pref}_1(v_1) = \mathrm{pref}_1(v_2)$, if $|g(u)| - |h(v)| > |z_h|$.

Proof. If $u_1$, $u_2$, $v_1$ and $v_2$ satisfy the conditions of the lemma, then the same
conditions are satisfied also by the words $u_1 u u_1$, $u_2 u u_2$, $v_1 v v_1$ and $v_2 v v_2$ resp.
Hence we can suppose that each of the words $u_1, u_2, v_1, v_2$ is longer than $z_h$.
Consider three cases.

1. First suppose that $|g(u)| < |h(v)| + |z_h|$. By (1), $h(v) z_h$ is a prefix of both
$h(v v_1)$ and $h(v v_2)$ and

$$\mathrm{pref}_1(g(u_1)) = \mathrm{pref}_1(g(u_2)) = \mathrm{pref}_1(g(u)^{-1} h(v) z_h) = x.$$

Since $g$ is a marked morphism, this implies that $\mathrm{pref}_1(u_1) = \mathrm{pref}_1(u_2)$.

2. Suppose on the other hand that $|g(u)| > |h(v)| + |z_h|$. Then $h(v_1)$, $h(v_2)$
have a common prefix longer than $z_h$ and $\mathrm{pref}_1(v_1) = \mathrm{pref}_1(v_2)$ is determined
by the letter $x = \mathrm{pref}_1((h(v) z_h)^{-1} g(u))$.

3. If $|g(u)| = |h(v)| + |z_h|$, then, clearly, $g(u) = h(v) z_h$. □

The previous lemma yields the following property.

Lemma 16. Let $g$ and $h$ be binary morphisms, and let $g$ be marked. Let $(c, d)$
and $(c', d')$ be distinct elements of $C(g, h)$, and suppose that $c$ and $c'$ are not
comparable. Put

$$u = c \wedge c', \qquad v = d \wedge d'.$$

Then

$$g(u) = h(v) z_h.$$

Proof. We have $c = u u_1$ and $c' = u u_2$ where $u_1, u_2 \in A^+$ and $\mathrm{pref}_1(u_1) \neq
\mathrm{pref}_1(u_2)$.

If $d$ and $d'$ are not comparable, then $d = v v_1$ and $d' = v v_2$ with $v_1, v_2 \in A^+$
and $\mathrm{pref}_1(v_1) \neq \mathrm{pref}_1(v_2)$, and the claim follows from Lemma 15.

If $d$ and $d'$ are comparable, then $|g(u)| - |h(v)| < 0 \leq |z_h|$. Since

$$g(u u_1 c) = h(v v_1 d), \qquad g(u u_2 c') = h(v v_2 d')$$

with $u_1 c, u_2 c', v_1 d, v_2 d' \in A^+$, Lemma 15 yields a contradiction with $\mathrm{pref}_1(u_1) \neq
\mathrm{pref}_1(u_2)$. □

Example 17. The previous lemma does not hold without the condition that $c$
and $c'$ are not comparable. Consider morphisms

$$g(a) = a, \qquad g(b) = b,$$
$$h(a) = a, \qquad h(b) = aab.$$

Then $(c, d) = (a, a)$, $(c', d') = (aab, b)$, $z_h = aa$, and

$$g(c \wedge c') = g(a) = a \neq aa = h(\varepsilon) z_h = h(d \wedge d') z_h.$$
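The computation in Example 17 can be replayed mechanically. The Python sketch below uses the morphisms and pairs of the example; the helper names (`image`, `lcp`) and the variables `c2`, `d2` standing for $c'$, $d'$ are naming choices made here.

```python
import os

def image(morphism, word):
    """Apply a morphism, given as a dict letter -> word, to a word."""
    return "".join(morphism[x] for x in word)

def lcp(x, y):
    """Maximal common prefix of two words."""
    return os.path.commonprefix([x, y])

g = {"a": "a", "b": "b"}
h = {"a": "a", "b": "aab"}
c, d = "a", "a"          # the pair (c, d)
c2, d2 = "aab", "b"      # the pair (c', d')

z_h = lcp(image(h, "ab"), image(h, "ba"))
# Both pairs belong to the coincidence set C(g, h).
in_coincidence_set = image(g, c) == image(h, d) and image(g, c2) == image(h, d2)

left = image(g, lcp(c, c2))           # g(c ∧ c')
right = image(h, lcp(d, d2)) + z_h    # h(d ∧ d') z_h
```

Since `c` is a prefix of `c2`, the two sides differ, which is exactly the failure described in the example.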

The ground for the characterization of the coincidence set is the following 
lemma. 

Lemma 18. Let $g$ and $h$ be binary morphisms, and let $g$ be marked. Let the
words $e, f \in A^+$ satisfy the following conditions:

(i) $z_h g(e) = h(f) z_h$.

(ii) The words $e, f$ are minimal, i.e.: if $u \leq e$, $v \leq f$ and $z_h g(u) = h(v) z_h$,
then either $u = v = \varepsilon$ or $u = e$ and $v = f$.

Then, given the first letter of $e$ or the first letter of $f$, the words $e$ and $f$ are
determined uniquely.

Proof. Suppose $e, f$ and $e', f'$ satisfy (i) and (ii), and $\mathrm{pref}_1(e) = \mathrm{pref}_1(e')$. Put
$c = e \wedge e'$, $d = f \wedge f'$. Since $g$ is a marked morphism, we have

$$z_h g(e) \wedge z_h g(e') = z_h g(c) \tag{3}$$

by Lemma 8. From (1) we deduce

$$h(f) z_h \wedge h(f') z_h = h(d) z_h. \tag{4}$$

Since $z_h g(e) = h(f) z_h$ and $z_h g(e') = h(f') z_h$, the equalities (3), (4) yield

$$z_h g(c) = h(d) z_h.$$

Since $c$ is nonempty, we deduce from (ii) that $c = e = e'$ and $d = f = f'$.
Similarly if $\mathrm{pref}_1(f) = \mathrm{pref}_1(f')$. □

This implies the following lemma. 

Lemma 19. Let $g$ and $h$ be binary morphisms, and let $g$ be marked.

(A) The rank of $C(g, h_m)$ is at most two.

(B) If the rank of $C(g, h_m)$ is two and $c(g, h_m) = \{(e, f), (e', f')\}$, then

$$\mathrm{pref}_1(e) \neq \mathrm{pref}_1(e'),$$
$$\mathrm{pref}_1(f) \neq \mathrm{pref}_1(f').$$

Proof. Recall that $h_m(u) = z_h^{-1} h(u) z_h$ to see that

$$C(g, h_m) = \{(u, v) \in A^* \times A^* \mid z_h g(u) = h(v) z_h\}.$$

The rest is a consequence of Lemma 18. □



Note that $(e, f) \in c(g, h_m)$ is just another formulation of the fact that $e, f$
are minimal words satisfying $z_h g(e) = h(f) z_h$, which are exactly the conditions of
Lemma 18. The pairs $(e, f)$ and $(e', f')$ are often called blocks of $g$ and $h$.

The question of the structure of the equality set $\mathrm{Eq}(g, h)$ can be seen as a
special case of the above considerations. If the conditions

$$u = v, \quad u_1 = v_1, \quad u_2 = v_2, \quad c = d, \quad c' = d', \quad e = f, \quad e' = f'$$

are added, then we get the following modifications of Lemma 15, Lemma 16,
Lemma 18 and Lemma 19 with analogous proofs, which we omit.

Lemma 20. Let $g$ and $h$ be binary morphisms, and let $g$ be marked. Let $u \in A^*$
be a word such that $g(u)$ and $h(u)$ are comparable, and

$$g(u) \neq h(u) z_h.$$

Let $u_1, u_2 \in A^+$ be words such that

$$g(u u_1) = h(u u_1),$$
$$g(u u_2) = h(u u_2).$$

Then $\mathrm{pref}_1(u_1) = \mathrm{pref}_1(u_2)$.

Lemma 21. Let $g$ and $h$ be binary morphisms, and let $g$ be marked. Let $c$ and
$c'$ be incomparable elements of $\mathrm{Eq}(g, h)$. Put $u = c \wedge c'$. Then

$$g(u) = h(u) z_h.$$

Lemma 22. Let $g$ and $h$ be binary morphisms, and let $g$ be marked. Let the
word $e \in A^+$ satisfy the following conditions:

(i) $z_h g(e) = h(e) z_h$.

(ii) The word $e$ is minimal, i.e.: if $e_1$ is a prefix of $e$ and $z_h g(e_1) = h(e_1) z_h$,
then $e_1 = \varepsilon$ or $e_1 = e$.

Then the word $e$ is determined uniquely by its first letter.

Lemma 23. Let $g$ and $h$ be binary morphisms, and let $g$ be marked.

(A) The rank of $\mathrm{Eq}(g, h_m)$ is at most two.

(B) If the rank of $\mathrm{Eq}(g, h_m)$ is two and $\mathrm{eq}(g, h_m) = \{e, e'\}$, then

$$\mathrm{pref}_1(e) \neq \mathrm{pref}_1(e').$$

Note that the previous lemma proves Theorem 2 for morphisms which are
marked from both sides. In the rest of the paper we show that this is essentially
the only situation in which the equality set can have rank greater than one.

Marked morphisms are in general much easier to deal with. That's why it is 
convenient to work with principal pairs, where one of the morphisms, say g, is 
marked. Moreover, it is always possible to use the marked version h m instead 
of h to get a marked pair, and thus a better insight into the coincidence set of 
g and h. 

The block structure of the coincidence set of marked morphisms leads to
the important concept of successor morphisms, first introduced in [5]. Consider
marked morphisms $g$ and $h$ such that $c(g, h)$ consists of two blocks $(e, f)$ and
$(e', f')$. Let $w$ be an element of $\mathrm{Eq}(g, h)$. The equality $g(w) = h(w)$ can be
uniquely split into a sequence of blocks. This means that $w$ is an element of
$\{e, e'\}^+$, and at the same time an element of $\{f, f'\}^+$. It is now natural to define
the successor morphisms $(g_1, h_1)$ by

$$g_1(a) = e, \quad h_1(a) = f, \qquad g_1(b) = e', \quad h_1(b) = f', \tag{5}$$

and to formulate the previous considerations by the following lemma.

Lemma 24. Let $g, h$ be marked morphisms such that

$$c(g, h) = \{(e, f), (e', f')\}.$$

Then the morphisms $g_1, h_1$ defined by (5) are marked. If $w \in \mathrm{Eq}(g, h)$, then
there is a unique word $w_1 \in \mathrm{Eq}(g_1, h_1)$ such that

$$g_1(w_1) = h_1(w_1) = w.$$

Proof. The morphisms $g_1$ and $h_1$ are marked by Lemma 19. The existence and
uniqueness of the word $w_1$ follows from $(w, w) \in C(g, h)$, and from Lemma 4. □
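For marked, non-erasing morphisms the blocks can be found greedily, since each next letter is forced by the first letter of the current overflow. The following Python sketch illustrates this; the function `block_starting_with`, the step bound `max_steps` and the sample pair of marked morphisms are assumptions made for illustration, not a procedure stated in the paper.

```python
def image(morphism, word):
    """Apply a morphism, given as a dict letter -> word, to a word."""
    return "".join(morphism[x] for x in word)

def block_starting_with(g, h, x, max_steps=1000):
    """Minimal pair (e, f) with pref_1(e) = x and g(e) = h(f), or None.

    Assumes g and h are marked and non-erasing, so every extension is
    determined by the first letter of the current overflow.
    """
    first_g = {g[y][0]: y for y in "ab"}
    first_h = {h[y][0]: y for y in "ab"}
    e, f = x, ""
    for _ in range(max_steps):
        ge, hf = image(g, e), image(h, f)
        if ge == hf:
            return e, f
        if ge.startswith(hf):            # g is ahead: extend f
            y = first_h.get(ge[len(hf)])
            if y is None:
                return None
            f += y
        elif hf.startswith(ge):          # h is ahead: extend e
            y = first_g.get(hf[len(ge)])
            if y is None:
                return None
            e += y
        else:                            # images are no longer comparable
            return None
    return None                          # no block found within the bound

# Sample marked pair (assumed for illustration).
g = {"a": "aabb", "b": "b"}
h = {"a": "a", "b": "bb"}
(e, f), (e2, f2) = block_starting_with(g, h, "a"), block_starting_with(g, h, "b")

# Successor morphisms as in (5).
g1 = {"a": e, "b": e2}
h1 = {"a": f, "b": f2}
```

By construction $g \circ g_1 = h \circ h_1$ on letters, and in this instance both successor morphisms are marked, in agreement with Lemma 24.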



5 The counterexample and its structure 

We now have all the necessary ingredients for the proof of our main claim,
Theorem 2. The proof will proceed essentially by contradiction. We shall assume
that there exists a counterexample to the claim, and gradually show that such
an assumption is contradictory.

We first formulate what is understood as a counterexample. 

Definition 25. We say that a pair of morphisms $(g, h)$ is a counterexample if

(a) The rank of $\mathrm{Eq}(g, h)$ is at least two;

(b) $g$ is marked and $h$ is not marked;

(c) $|g(a)| > |h(a)|$ and $|g(b)| < |h(b)|$.

The third condition takes advantage of the symmetry of letters $a$ and $b$.
Note that the strict inequalities do not harm generality, since $|g(a)| = |h(a)|$ or
$|g(b)| = |h(b)|$ would imply $g = h$. Since the letters $a$ and $b$ are not
interchangeable anymore, we shall sometimes need the morphism $\pi$ defined by $\pi(a) = b$
and $\pi(b) = a$.

The following lemma yields basic information about the structure of the 
equality set of a counterexample. 

Lemma 26. Let $(g, h)$ be a counterexample. Then there exist nonempty words
$\sigma$, $\nu_a$ and $\nu_b$ such that $|\sigma|_a \geq 1$,

$$\mathrm{pref}_1(\nu_a) = a, \qquad \mathrm{pref}_1(\nu_b) = b,$$

the words $\sigma\nu_a$, $\sigma\nu_b$ are two distinct elements of $\mathrm{eq}(g, h)$ and

$$g(\sigma) = h(\sigma) z_h, \tag{6}$$
$$z_h g(\nu_a) = h(\nu_a), \tag{7}$$
$$z_h g(\nu_b) = h(\nu_b). \tag{8}$$

Proof. Let $u$ and $v$ be two distinct elements of $\mathrm{eq}(g, h)$. Note that $u$ and $v$
are not comparable, and put $\sigma = u \wedge v$, $u_1 = \sigma^{-1} u$ and $v_1 = \sigma^{-1} v$. Clearly,
$\mathrm{pref}_1(u_1) \neq \mathrm{pref}_1(v_1)$ and the choice of $\nu_a$ and $\nu_b$ is now obvious. The equalities
(6), (7) and (8) are yielded by Lemma 21 and $|\sigma|_a \geq 1$ follows from $|g(b)| <
|h(b)|$. □

The equalities (6), (7) and (8) are of special importance in the proof. They
represent two points where the structure of a counterexample is well defined,
and which therefore yield information for a combinatorial analysis.

The following lemma makes sure that the counterexample defined above 
deserves its name. 

Lemma 27. Let $g_1$ and $h_1$ be nonperiodic binary morphisms such that $\mathrm{eq}(g_1, h_1)$
contains two elements $\alpha$ and $\beta$ with the same first letter. Then there is a
counterexample $(g, h)$ such that $\mathrm{Eq}(g, h) = \mathrm{Eq}(g_1, h_1)$.

Moreover, if $\overline{g}_1$ ($\overline{h}_1$ resp.) is marked, then also $\overline{g}$ ($\overline{h}$ resp.) is marked.

Proof. Lemma 13 yields principal morphisms $g$ and $h$ such that $\mathrm{Eq}(g, h) =
\mathrm{Eq}(g_1, h_1)$. By Lemma 14 and by the symmetry of $g$ and $h$, we can suppose
that $g$ is marked. Similarly, by the symmetry of $a$ and $b$, we can suppose that
the condition (c) of Definition 25 is satisfied. In order to see that $(g, h)$ is a
counterexample, it remains to show that $h$ is not marked. If $h$ is marked, then
both morphisms are marked, and $\mathrm{pref}_1(\alpha) \neq \mathrm{pref}_1(\beta)$ by Lemma 23, contrary
to the assumption.

Markedness of reversals is conserved by Lemma 13. □

The further strategy is to show that there is no counterexample. We shall
divide the investigation into several stages.

6 When $z_h$ commutes

In this section we investigate two special situations, in which $z_h$ commutes with
one of the image words. We show that those situations lead to a contradiction.
We start with a technical lemma, which will be the core of the proof. In the
original version of this paper the claim had the following strong form:

Lemma. Let $g, h : A^* \to A^*$ be two marked morphisms. Let $u$, $u'$, $v$ and
$v' \in A^*$ be words, and $s$, $r$, $q$ positive integers such that

$$g(a^s b u) = h(a^s b u'), \qquad g(a^r b v) = h(a^q b v').$$

Then $s = r = q$.

However, as Markku Laine pointed out by constructing an example, this 
claim does not hold. The example is as follows. 

Example 28. Let

$$g(a) = a^2 b^2, \qquad h(a) = a,$$
$$g(b) = b, \qquad h(b) = b^2.$$

Then

$$g(a^2 b^2) = h(a^2 b a^2 b b) = a^2 b^2 a^2 b^4,$$

and

$$g(a b^2) = h(a^2 b^2) = a^2 b^4.$$
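The two equalities of Example 28 are quick to confirm by direct computation. In the Python sketch below the morphisms are those of the example; the reading of the exponents ($s = 2$, $r = 1$, $q = 2$) is spelled out in the comments.

```python
def image(morphism, word):
    """Apply a morphism, given as a dict letter -> word, to a word."""
    return "".join(morphism[x] for x in word)

g = {"a": "aabb", "b": "b"}
h = {"a": "a", "b": "bb"}

# First equality: s = 2, u = b, u' = aabb, so a^s b u = aabb, a^s b u' = aabaabb.
first_g = image(g, "aabb")
first_h = image(h, "aabaabb")

# Second equality: r = 1, q = 2, v = v' = b, so a^r b v = abb, a^q b v' = aabb.
second_g = image(g, "abb")
second_h = image(h, "aabb")
```

Both morphisms are marked, and the exponents $s = 2$, $r = 1$, $q = 2$ are not all equal, which is why the strong form of the lemma fails.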

We therefore present a bit weaker version, which fits the purpose of this 
paper. 

Lemma 29. Let $g, h : A^* \to A^*$ be two marked morphisms. Let $u$ and $v \in A^*$
be words, and $s$, $r$, $q$ positive integers such that

$$g(a^s b u) = h(a^s b u), \tag{9}$$
$$g(a^r b v) = h(a^q b v). \tag{10}$$

Then $s = r = q$.

Proof. Recall that we suppose $g \neq h$. (Obviously, only $r = q$ is forced if $g = h$.)
Let $g$ and $h$ be morphisms satisfying the assumptions, and suppose that $s = r = q$
does not hold. Assume, moreover, that $g$ and $h$ are chosen such that the length
of $a^s b u$ is the smallest possible. We show that $a^s b u$ can be shortened, and hence
obtain a contradiction.

We first prove that $g(a)$ and $h(a)$ do not commute. Suppose for a while that
$|g(a)| > |h(a)|$, and that $t$ is the common primitive root of $g(a)$ and $h(a)$. From
(9) we deduce that $h(b)$ is comparable with $h(a^s)^{-1} g(a^s)$, which is an element
of $t^+$. That is a contradiction with $h$ being marked. Similarly if $|g(a)| < |h(a)|$.
(Clearly, $g(a) = h(a)$ implies $g = h$.)

Let us continue the proof of the lemma. Lemma 8 applied once to $g$ and
once to $h$ gives

$$g(a^s b u \wedge a^r b v) = g(a^s b u) \wedge g(a^r b v) = h(a^s b u) \wedge h(a^q b v) = h(a^s b u \wedge a^q b v). \tag{11}$$

1. If $s \neq r$ and $s \neq q$, then (11) yields

$$g(a^i) = h(a^j),$$

with $i = \min(s, r)$, $j = \min(s, q)$. Therefore the words $g(a)$ and $h(a)$ commute,
a contradiction.

2. Suppose next, by symmetry, $s = r$ and $s \neq q$. Put $m = \min(s, q)$. Equality
(11) implies

$$g(a^s b w) = h(a^m), \tag{12}$$

where $w = u \wedge v$.

The set $C(g, h)$ contains the elements $(a^s b u, a^s b u)$ and $(a^s b w, a^m)$, whence the
rank of $C(g, h)$ is two. Let $(e, f)$ and $(e', f')$ be the blocks of $g$ and $h$, and let
$g_1$, $h_1$ be their successor morphisms defined by (5).

By symmetry, suppose that $\mathrm{pref}_1(f) = a$. Equality (12) implies that there
is a positive integer $p$ such that $f = a^p$. Since $g(a)$ and $h(a)$ do not commute,
we deduce that $e \notin a^+$ and thus $|e| > s$. Since $a^s b u$ and $a^q b v$ are elements of
$\{f, f'\}^*$, both $s$ and $q$ are multiples of $p$. Put

$$s_1 = \frac{s}{p}, \qquad q_1 = \frac{q}{p},$$

and define words $u_1$ and $v_1$ by

$$g_1(u_1) = a^s b u, \qquad h_1(u_1) = a^s b u,$$
$$g_1(v_1) = a^s b v, \qquad h_1(v_1) = a^q b v.$$

Since $f = a^p$, the words $u_1$ and $v_1$ can be factorized as

$$u_1 = a^{s_1} b u_2, \qquad v_1 = a^{q_1} b v_2,$$

with $u_2, v_2 \in A^*$. Therefore

$$g_1(a^{s_1} b u_2) = h_1(a^{s_1} b u_2) = a^s b u,$$
$$g_1(a^{q_1} b v_2) = h_1(a^{s_1} b v_2) = a^s b v.$$

Inequality $s \neq q$ implies $s_1 \neq q_1$, and $|e| > s$ yields $|a^{s_1} b u_2| < |a^s b u|$. This
completes the proof. □

The following two claims exploit the previous lemma. The words $\sigma$, $\nu_a$ and
$\nu_b$ are as in Lemma 26.

Claim 1. There is no counterexample such that $z_h$ commutes with $g(b)$ and
$\mathrm{pref}_1(\sigma) = b$.

Proof. Suppose that $(g, h)$ is such a counterexample, and let $t$ be the common
primitive root of $z_h$ and $g(b)$. Let $b^\ell$ be the maximal $b$-prefix of $\sigma\nu_a$ and $b^k$ be
the maximal $b$-prefix of $\nu_b\sigma$. Then $g(b)^\ell$ is the maximal $t$-prefix of $g(\sigma\nu_a)$ and
$z_h g(b)^k$ is the maximal $t$-prefix of $z_h g(\nu_b\sigma)$.

Suppose that $h(b)$ commutes with $g(b)$. Since $|h(b)| > |g(b)|$, the equality
$g(\sigma\nu_a) = h(\sigma\nu_a)$ implies that $g(a)$ is comparable with $g(b)^{-\ell} h(b)^\ell$, a
contradiction with $g$ being marked. Therefore $h(b)$ and $g(b)$ do not commute.

By Lemma 7(B), the maximal $t$-prefix of $h(b) z_h$ is shorter than $|h(b) t|$. This
implies, by (1), that all words $h(bu)$ long enough have the same maximal
$t$-prefix. In particular, the maximal $t$-prefix of $h(\sigma\nu_a)$ is the same as the maximal
$t$-prefix of $h(\nu_b\sigma)$. From $h(\sigma\nu_a) = g(\sigma\nu_a)$ and $h(\nu_b\sigma) z_h = z_h g(\nu_b\sigma)$ we deduce
that $g(b)^\ell = z_h g(b)^k$ and $k \neq \ell$.

Put $\sigma' = b^{-\ell}\sigma$ and note that $\sigma'$ is nonempty since $\sigma$ contains the letter $a$.
Then

$$z_h g(b^k \sigma') = h(b^\ell \sigma') z_h, \qquad z_h g(\nu_b \sigma) = h(\nu_b \sigma) z_h,$$

and Lemma 29, applied to the morphisms $h_m \circ \pi$ and $g \circ \pi$, yields a contradiction. □

Claim 2. There is no counterexample such that $\mathrm{pref}_1(\sigma) = a$, $z_h$ commutes
with $h(a)$, and the common primitive root of $z_h$ and $h(a)$ is a suffix of $g(a)$.

Proof. As in the previous proof, suppose that $(g, h)$ satisfies the assumptions of the
claim and let $t$ be the common primitive root of $z_h$ and $h(a)$. Let $a^\ell$ be the
maximal $a$-prefix of $\sigma\nu_b$, and $a^k$ be the maximal $a$-prefix of $\nu_a\sigma$.

First, suppose that $g(a)$ commutes with $h(a)$ and $z_h$. Since $g$ is marked,
the maximal $t$-prefix of $z_h g(\nu_a\sigma)$ is $z_h g(a)^k$. From (1), we deduce that the
word $z_h$ is the maximal $t$-prefix of $h(bu) z_h$ for any $u$. Hence the maximal
$t$-prefix of $h(\nu_a\sigma) z_h$ is $h(a)^k z_h$. The equality $z_h g(\nu_a\sigma) = h(\nu_a\sigma) z_h$ now yields
$z_h g(a)^k = h(a)^k z_h$, a contradiction with $|g(a)| > |h(a)|$. Therefore $g(a)$ and
$h(a)$ do not commute.

Since, by assumption, $t$ is a suffix of $g(a)$, Lemma 7(B) implies that the
maximal $t$-prefix of $g(\sigma\nu_b) = h(\sigma\nu_b)$ is equal to the maximal $t$-prefix of $g(a)$. Using
(1) as above, we deduce that this maximal $t$-prefix is equal to $h(a)^\ell z_h$.
In this case, the equality $z_h g(\nu_a\sigma) = h(\nu_a\sigma) z_h$ implies $z_h h(a)^\ell z_h = h(a)^k z_h$,
whence $z_h = h(a)^{k-\ell}$.

For $\sigma' = a^{-\ell}\sigma$ we obtain

$$z_h g(a^\ell \sigma') = h(a^k \sigma') z_h, \qquad z_h g(\nu_a \sigma) = h(\nu_a \sigma) z_h.$$

Since $g(a)$ and $h(a)$ do not commute, we deduce that $\sigma \notin a^+$, whence $\mathrm{pref}_1(\sigma') =
b$ and the morphisms $h_m$, $g$ satisfy the assumptions of Lemma 29, a contradiction. □

7 Case: $\overline{g}$ is not marked

In this section we deal with the situation when $\overline{g}$ is not marked. Note that then
$\overline{h}$ is marked by Lemma 14, and one can verify that $(\overline{h} \circ \pi, \overline{g} \circ \pi)$ is also a counterexample.
Recall that $\pi$ exchanges the letters $a$ and $b$, and it is applied in order to satisfy the
condition (c) of Definition 25. This allows us to suppose

$$|\overline{z}_g| \geq |z_h|. \tag{13}$$

More precisely, if $|\overline{z}_g| < |z_h|$, then we consider $(\overline{h} \circ \pi, \overline{g} \circ \pi)$ instead of $(g, h)$.

The equality (1) applied to reversals implies that $\overline{z}_g$ is a suffix of any $g(u)$
long enough. Especially,

$$\overline{z}_g \text{ is a suffix of } g(a)^+, \tag{14}$$
$$\overline{z}_g \text{ is a suffix of } g(b)^+. \tag{15}$$

Since $z_h$ is a suffix of $g(\sigma)$, which is suffix comparable with $\overline{z}_g$, we deduce from
(13) that

$$z_h \in \mathrm{suff}(\overline{z}_g). \tag{16}$$

The following claim excludes the situation of this section.
Claim 3. There is no counterexample with g not marked.
Proof. Suppose that (g, h) is such a counterexample.

1. Suppose first that pref_1(σ) = a. The equality g(σ) = h(σ)z_h yields
h(a) ∈ pref(g(a)), and z_h g(v_a) = h(v_a) implies that h(a)z_h is a prefix of
z_h g(a). Thus z_h h(a) = h(a)z_h.

Let t be the common primitive root of h(a) and z_h. From (16) we deduce that t
is a suffix of z_g, and (14) together with |g(a)| > |h(a)| ≥ |t| yields that t
is a suffix of g(a). This is a contradiction with Claim 2.

2. Suppose then that pref_1(σ) = b. From (15) and (16) we deduce that z_h g(b)
is a suffix of g(b)^+. The equalities g(σ) = h(σ)z_h and z_h g(v_b) = h(v_b)
imply that g(b) is a prefix of h(b), and that h(b) is comparable with z_h g(b),
respectively. Therefore g(b) is a prefix of z_h g(b), and Lemma 7(D) yields
that g(b) and z_h commute, a contradiction with Claim 1. □



8 Case: h is not marked

In this section we consider the situation when ḡ is marked and h is not. We
shall not exclude this case directly. Instead we reduce it to the case when both
g and h are marked.

To accomplish this plan, we first need a description of the possible
counterexample structure that is more precise than Lemma 26.

Lemma 30. Let (g, h) be a counterexample. Then one of the following
possibilities takes place.

(A) There exist words σ, μ_a, μ_b ∈ A^+ and τ ∈ A^* such that

eq(g, h) = {σμ_aτ, σμ_bτ},

where

g(σ) = h(σ)z_h,     z_h g(μ_a) z̄_h = h(μ_a),   pref_1(μ_a) = a,
g(τ) = z̄_h h(τ),    z_h g(μ_b) z̄_h = h(μ_b),   pref_1(μ_b) = b,
suff_1(μ_a) ≠ suff_1(μ_b).

(B) There exist words ζ, μ, ρ, η ∈ A^+ such that

eq(g, h) = ζ(ρμ)^*ρη = ζρ(μρ)^*η,

and

g(ζ) z̄_h = h(ζ),        z_h g(μ) z̄_h = h(μ),   pref_1(μ) ≠ pref_1(η),
g(ρ) = z̄_h h(ρ) z_h,    z_h g(η) = h(η),        suff_1(μ) ≠ suff_1(ζ).

Proof. Let α and β be two shortest elements of eq(g, h). Put σ = α ∧ β, and
similarly let τ be the longest common suffix of α and β. By Lemma 21 applied
first to g and h, and then to ḡ and h̄, we have

g(σ) = h(σ)z_h, (17)
g(τ) = z̄_h h(τ). (18)

Denote by v_0 and v_1 the words σ^{-1}α and σ^{-1}β. Clearly,
pref_1(v_0) ≠ pref_1(v_1).

1. First suppose that v_0 and v_1 are not suffix-comparable. Then with a
suitable choice of i, j ∈ {0, 1} we have v_i = μ_aτ, v_j = μ_bτ, and
pref_1(μ_ξ) = ξ for both ξ ∈ A.

Therefore {σμ_aτ, σμ_bτ} = {α, β}. We show that σ is the unique prefix of α
(β resp.) satisfying (17).

Suppose first that σ_1σ_2 = σ and g(σ_1) = h(σ_1)z_h. Then it is easy to see
that also σ_1v_i ∈ Eq(g, h), i = 0, 1, a contradiction with α and β being the
shortest elements of eq(g, h).

Let then v_i = w_1w_2, for some i ∈ {0, 1}, and g(σw_1) = h(σw_1)z_h. Then σw_2
is an element of Eq(g, h), which is shorter than σv_i. Since α and β are the
shortest elements of eq(g, h), it remains that σw_2 = σv_{1-i}. But then v_0
and v_1 are suffix-comparable, a contradiction.

We still have to show that the set {α, β} generates the whole Eq(g, h). Suppose
that w is an element of Eq(g, h) such that neither α nor β is a prefix of w,
and consider the words w_i = w ∧ σv_i, i = 0, 1. Lemma 21 implies that
g(w_i) = h(w_i)z_h for both i = 0, 1. It is easy to deduce that w_0 and w_1
cannot both be equal to σ, a contradiction with the previous paragraph.
Consequently, we have the case (A).

2. Suppose now, by symmetry, that v_1 = uv_0. Then z_h g(u) = h(u)z_h and
σu^*v_0 is a subset of Eq(g, h). Moreover, σ and σu are the only prefixes of
σuv_0 satisfying (17). The proof is similar to the one above: any other prefix
satisfying (17) allows us to drop a part of the word, which contradicts the
minimality of α and β. We omit the details. This, in particular, implies that u
is not a suffix of σ.

We show that σu^*v_0 generates the whole equality set. Suppose the contrary,
and let w be the shortest element of eq(g, h) that is not in σu^*v_0. As above,
the words w_0 = w ∧ σv_0 and w_1 = w ∧ σuv_0 satisfy g(w_i) = h(w_i)z_h,
i = 0, 1. Therefore w_0 = σ, by the previous paragraph. From
pref_1(u) ≠ pref_1(v_0), one obtains that w_1 is strictly longer than σ, which
implies w_1 = σu. Therefore w = σuw' for some w'. Hence σw' is an element of
eq(g, h) shorter than w, and thus an element of σu^*v_0. Therefore w' ∈ u^*v_0
and w ∈ σu^*v_0, a contradiction.

We have seen that u is not a suffix of σ. Also σ cannot be a suffix of u,
otherwise σuσ^{-1} ∈ Eq(g, h) would contradict the minimality of α and β. We
can therefore define ρ as the longest common suffix of u and σ. The word ρ is
not empty since z_h is a suffix of both g(u) and g(σ), and g is marked. Denote
η = v_0, ζ = σρ^{-1} and μ = uρ^{-1}.

Note that the word τ = ρη is the longest common suffix of α and β. Lemma 21
applied to (ḡ, h̄) yields g(ρη) = z̄_h h(ρη). The verification of all claims in
case (B) is now straightforward. □
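
The proof above is organized around the longest common prefix σ = α ∧ β and the longest common suffix τ of two words. As a small illustration, here is a Python sketch of these two operations; the function names and the sample words standing in for α and β are hypothetical and are not equality words of any morphisms from the paper.

```python
def common_prefix(u, v):
    """Longest common prefix of two words (written u ∧ v in the text)."""
    n = 0
    while n < min(len(u), len(v)) and u[n] == v[n]:
        n += 1
    return u[:n]


def common_suffix(u, v):
    """Longest common suffix, computed as a common prefix of the reversals."""
    return common_prefix(u[::-1], v[::-1])[::-1]


# Hypothetical words standing in for alpha and beta.
alpha, beta = "abaab", "abbab"

sigma = common_prefix(alpha, beta)   # plays the role of sigma = alpha ∧ beta
tau = common_suffix(alpha, beta)     # plays the role of tau
v0 = alpha[len(sigma):]              # sigma^{-1} alpha
v1 = beta[len(sigma):]               # sigma^{-1} beta

# After removing the longest common prefix, the remainders start differently.
assert v0[0] != v1[0]
```

Taking the suffix via reversals mirrors the way the text passes between a statement and its reversed version.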

Note that the previous lemma proves, in particular, Theorem 1(C).
The following claim allows us to suppose that both g and h are marked, which
was the task of this section.

Claim 4. Let (g, h) be a counterexample. Then there exists also a
counterexample (g_1, h_1) such that both ḡ_1 and h̄_1 are marked.

Proof. Suppose that (g, h) is a counterexample. Then g is marked by Claim 3.
Suppose that z̄_h ≠ ε and define g_1 and h_1 by

g_1(u) = g(u), h_1(u) = z̄_h h(u)(z̄_h)^{-1}.

It is not difficult to see that the morphism h_1 is well defined, it is not
marked while h̄_1 is marked. It remains to show that Eq(g_1, h_1) has rank at
least two. This is a consequence of the characterization presented in Lemma 30.
(We shall use its notation.)

1. If the case (A) of Lemma 30 takes place, then

{τσμ_a, τσμ_b} ⊆ Eq(g_1, h_1).

2. If, on the other hand, we have the case (B) of Lemma 30, then

{ρμ, ρηζ} ⊆ Eq(g_1, h_1).

The definitions in Lemma 30 yield pref_1(μ_a) ≠ pref_1(μ_b) and
pref_1(μ) ≠ pref_1(η), whence the equality set has in both cases rank at least
two. □
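
The construction of h_1 in the proof is a conjugation of a morphism by a fixed word. The following Python sketch illustrates the mechanics under the assumption that the definition reads h_1(u) = z h(u) z^{-1} for a word z; the function `conjugate`, the example morphism and the word z are hypothetical and chosen only so that the conjugate is well defined.

```python
from itertools import product


def apply(morphism, word):
    """Image of `word` under a morphism given as a dict letter -> string."""
    return "".join(morphism[c] for c in word)


def conjugate(morphism, z):
    """Return the morphism x -> z * morphism(x) * z^{-1}.

    Raises ValueError when z is not a suffix of z * morphism(x) for some
    letter x, i.e. when the conjugate is not well defined."""
    result = {}
    for letter, image in morphism.items():
        shifted = z + image
        if not shifted.endswith(z):
            raise ValueError(f"conjugate undefined on letter {letter!r}")
        result[letter] = shifted[: len(shifted) - len(z)]
    return result


# Hypothetical example: "a" is the longest common suffix of
# h(ab) = "baa" and h(ba) = "aba".
h = {"a": "ba", "b": "a"}
z = "a"
h1 = conjugate(h, z)

# The defining identity h1(u) z = z h(u), checked on all words up to length 5.
for n in range(6):
    for letters in product("ab", repeat=n):
        u = "".join(letters)
        assert apply(h1, u) + z == z + apply(h, u)
```

In this example both images of h1 start with the first letter of z, while their last letters differ, which is the kind of trade-off the proof exploits.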



9 Case: g and h marked.

From now on we shall suppose that both g and h are marked. Consider Lemma 30.
It is easy to note that the case (A) of the lemma has to take place, and
moreover, the word τ is empty. Therefore

eq(g, h) = {σμ_a, σμ_b},

with pref_1(μ_a) = a, pref_1(μ_b) = b, and suff_1(μ_a) ≠ suff_1(μ_b).
Note the following useful fact.

Lemma 31. Let (g, h) be a counterexample such thatg and h are marked. Put 
g\ = g and h\ = h m . Then the pair (gi,hi) is again a counterexample such 
that ~g~\ and h\ are marked, and 

eq(gx,hi) = {a~p^,ajl b ~}. 

Proof. The verification is straightforward. □ 

In this section, we will also need to assume that the pair (g, h) is a shortest
counterexample, that is, |σμ_a| + |σμ_b| is as small as possible. Shortest
counterexamples have the following important properties, which can be
summarized as: there are no repeated overflows. The proof is similar to the
proof of Lemma 29. If there is a repeated overflow, then we can decompose the
counterexample into blocks and find a shorter counterexample, namely the pair
of successor morphisms. Since both g and h are marked, we will consider their
blocks, which are easier to deal with.

Lemma 32. Let (g, h) be a counterexample such that g and h are marked. Let
two nonempty prefixes σ_1 and σ_2 of σ satisfy g(σ_1) = h(σ_2). Then (g, h) is
not a shortest counterexample.

Proof. Lemma IT91 applied to morphisms ~g and h implies that pairs (aji^ , crjIZ), 
{crpLb , ojlb ) and (57 , &2 ) can be factorized into a sequence of pairs (e , / ) and 
(e' , /' ) such that g (e ) = h (/ ) and 5 (e' ) = h (/' ). Turning to reversals and 
defining g_1 and h_1 as in (5) we obtain words w, w' ∈ Eq(g_1, h_1) such that

g_1(w) = h_1(w) = σμ_a, g_1(w') = h_1(w') = σμ_b.

Note also that ~g~i and hi are marked by Lemma EMI 

Since (σ_1, σ_2) is a prefix of both (σμ_a, σμ_a) and (σμ_b, σμ_b), the words w
and w' have a nonempty common prefix. From g ≠ h, it is also easy to see that
|w| + |w'| < |σμ_a| + |σμ_b|. Lemma 27 concludes the proof. □

Lemma 33. Let (g, h) be a shortest counterexample such that g and h are marked.
Let prefixes σ_1, σ_2, σ_1' and σ_2' of σ satisfy

h(σ_2)^{-1} g(σ_1) = h(σ_2')^{-1} g(σ_1'). (19)

Then σ_1 = σ_1' and σ_2 = σ_2'.

Recall that we allow (19) to be an equality of two "negative" words if
g(σ_1) < h(σ_2) and g(σ_1') < h(σ_2').

Proof. Proceed by contradiction. Without loss of generality, we can suppose
|σ_1| > |σ_1'| and |σ_2| > |σ_2'|. Let u_1 be the longest common suffix of σ_1
and σ_1', and let u_2 be the longest common suffix of σ_2 and σ_2'. We want to
show that

g(σ_1 u_1^{-1}) = h(σ_2 u_2^{-1}). (20)

From (19) and g(σμ_a) = h(σμ_a), we deduce

g(σ_1' σ_1^{-1} σμ_a) = h(σ_2' σ_2^{-1} σμ_a). (21)

If u_1 = σ_1' and u_2 = σ_2', then (20) follows from (21). Otherwise, we apply
Lemma 16 to the morphisms g and h (note that the role of g and h is
interchangeable since both are marked), and to the pairs

(σ_1' σ_1^{-1} σμ_a, σ_2' σ_2^{-1} σμ_a) and (σμ_a, σμ_a)

to obtain

g(u_1 σ_1^{-1} σμ_a) = h(u_2 σ_2^{-1} σμ_a),

whence (20) follows too.

The rest is Lemma 32. □

As a particular case, we point out the following corollary.

Lemma 34. Let (g, h) be a shortest counterexample such that g and h are marked.
Let two prefixes σ_1 and σ_2 of σ satisfy g(σ_1) = h(σ_2)z_h. Then
σ_1 = σ_2 = σ.

9.1 The case: pref_1(σ) = a or suff_1(σ) = a

In this subsection we show that the word σ of a counterexample can neither
start nor end with the letter a.

Since |g(a)| > |h(a)| and suff_1(μ_c) = a for some c ∈ A, we have

h(a) ∈ suff(g(a)). (22)

Claim 5. There is no counterexample such that both g and h are marked and
pref_1(σ) = a or suff_1(σ) = a.

Proof. Let first pref_1(σ) = a. As in the proof of Claim 3, we obtain that z_h
and h(a) have a common primitive root, say t. From (22) we have that t is a
suffix of g(a), which yields a contradiction with Claim 2.

The case suff_1(σ) = a follows from the same considerations for the morphisms
ḡ and h̄_m by Lemma 31. □

9.2 The case: pref_1(σ) = suff_1(σ) = b

In this subsection we shall suppose that (g, h) is a counterexample such that
ḡ and h are marked, and pref_1(σ) = suff_1(σ) = b. We shall restrict possible
counterexamples to the case μ_b ∈ b^+.
We first fix some notation.

Convention 35.

• Denote by ℓ the maximal integer such that b^ℓ is a prefix of σ.

• Denote by k the maximal integer such that b^k is a prefix of μ_bσ.

• Denote by ℓ' the maximal integer such that b^{ℓ'} is a suffix of σμ_b or σμ_a
(the one of the two equality words ending with b).

• Denote by k' the maximal integer such that b^{k'} is a suffix of σ.

We make use of Lemma 31 and suppose that k' ≥ ℓ. In other words, we shall work
either with (g, h) or with (ḡ, h̄_m), depending on whether σ has more b's in the
front or in the rear.

We first present some auxiliary lemmas. 

Lemma 36. The words g(b) and h(b) do not commute.

Proof. Suppose, for a contradiction, that t is the common primitive root of g(b)
and h(b). Since |g(b)| < |h(b)|, we deduce, by g(σ) = h(σ)z_h, that the first
occurrence of g(a) in g(σ) is comparable with t, a contradiction with g being
marked. □
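
Several arguments in this section, including the proof above, rest on the standard fact that two nonempty words commute exactly when they have the same primitive root. The following Python sketch checks this fact by brute force on short binary words; the function names are illustrative only.

```python
from itertools import product


def primitive_root(word):
    """Shortest word t such that `word` is a power of t (word is nonempty)."""
    n = len(word)
    for d in range(1, n + 1):
        if n % d == 0 and word[:d] * (n // d) == word:
            return word[:d]


def commute(x, y):
    """True if xy = yx."""
    return x + y == y + x


# All nonempty binary words of length at most 4.
words = ["".join(p) for n in range(1, 5) for p in product("ab", repeat=n)]

for x in words:
    for y in words:
        assert commute(x, y) == (primitive_root(x) == primitive_root(y))
```

The divisor loop is quadratic in the word length, which is adequate for experiments of this size.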

Lemma 37. |h(b)| ≥ (ℓ + ℓ' - 1)|g(b)|.

Proof. The word h(b) is comparable with g(b)^ℓ and suffix-comparable with
g(b)^{ℓ'}. If |h(b)| < (ℓ + ℓ' - 1)|g(b)|, then g(b) and h(b) commute by
Lemma 7(B), a contradiction with Lemma 36. □

Lemma 38. |z_h| ≥ (ℓ + k' - 1)|g(b)|.

Proof. The word z_h is comparable with g(b)^ℓ, since it is comparable with
h(b), and g(b)^ℓ is a prefix of h(b). Also z_h is suffix-comparable with
g(b)^{k'}, by g(σ) = h(σ)z_h.

First suppose that |z_h| ≥ |g(b)|. Now, if |z_h| < (ℓ + k' - 1)|g(b)|, then z_h
and g(b) commute by Lemma 7(B), a contradiction with Claim 1.

Suppose now that z_h is shorter than g(b). Recall that g(b) is a prefix of
g(μ_b), a prefix of h(b), and a suffix of g(σ). From g(σ) = h(σ)z_h and
z_h g(μ_b) = h(μ_b) we deduce that there is a word v such that g(b) = vz_h and
at the same time g(b) = z_h v. Again, the words g(b) and z_h commute, a
contradiction. □

An important step in the proof is the following lemma, which shows that g(a)
cannot be too short.

Lemma 39. |g(ba)| > |h(b)|.

Proof. In this proof we shall consider occurrences of g(b)s and h(b)s in g(σ)
and h(σ), and their relative position. The idea is quite intuitive, but we give
a more formal definition. Let i, j ≤ |σ|_b be positive integers. Denote by u_i
the prefix of σ such that also u_ib is a prefix of σ, and |u_ib|_b = i.

We say that the i-th occurrence of g(b) in g(σ) starts within the j-th
occurrence of h(b) in h(σ), if

|h(u_j)| < |g(u_i)| < |h(u_jb)|.

Similarly, we say that the i-th occurrence of g(b) in g(σ) ends within the j-th
occurrence of h(b) in h(σ), if

|h(u_j)| < |g(u_ib)| < |h(u_jb)|.

Lemma 38 implies that the last occurrence of g(b) in g(σ) both starts and ends
outside h(σ). Therefore, by the pigeonhole principle, there is an occurrence of
h(b) in h(σ) such that no occurrence of g(b) in g(σ) starts within it.
Similarly, there is an occurrence of h(b) in h(σ) within which no g(b) ends.
From this it is easy to deduce that h(b) is a prefix of sg(a)^+, and a suffix
of g(a)^+p, where s is a suffix of g(b) or g(a), and p is a prefix of g(b) or
g(a). Let t be the primitive root of g(a).

By g(σ) = h(σ)z_h, the words h(b) and g(b^ℓa) are prefix comparable. Suppose
that g(b^ℓ)t is a prefix of h(b). From the fact that g(b^ℓ)g(a) is a prefix of
sg(a)^+ we deduce by Lemma 7(C) that ḡ is not marked, a contradiction.
Similarly, we obtain a contradiction with g being marked, if tg(b)^{ℓ'} is a
suffix of h(b). Therefore

|h(b)| < |g(b)^ℓ t| and |h(b)| < |t g(b)^{ℓ'}| (23)

and we are through if ℓ = 1.

Suppose ℓ ≥ 2. Again by the pigeonhole principle, there are at least two
occurrences of h(b) in g(σ) with no starting g(b). Therefore h(b) is a prefix
of s_1g(a)^+ and s_2g(a)^+, where s_1 and s_2 are proper suffixes of g(b) or
g(a). Note that s_1 and s_2 are overflows in σ, whence s_1 ≠ s_2 by Lemma 33.
Suppose, for a contradiction, that |g(ba)| ≤ |h(b)|. From s_1 ≠ s_2, it is then
not difficult to deduce that g(a) overlaps nontrivially with g(a)^2, whence it
is not primitive and |g(a)| ≥ 2|t|. From this and from (23) we obtain

2|h(b)| < |g(b)^ℓ t| + |g(b)^{ℓ'} t| ≤ (ℓ + ℓ')|g(b)| + |g(a)| ≤ (ℓ + ℓ' - 1)|g(b)| + |h(b)|,

a contradiction with Lemma 37. □
We can now once more point out two commuting words.

Lemma 40. The word h(b) commutes with z_h g(b)^{k-ℓ}.

Proof. Lemma 38 and the definition of k' imply that g(b)^{k'} is a suffix of
z_h. The assumption k' ≥ ℓ guarantees that z_h g(b)^{k-ℓ} is a well defined
prefix of z_h g(b)^k.

Let u be the word g(b)^{-ℓ}h(b), which is a prefix of g(a) by Lemma 39. Since
|h(b)| > |g(b)| and ℓ ≥ 1, we have

|h(b)^k z_h| > |z_h g(b)^{k-ℓ} h(b)|.

The equality z_h g(μ_b) = h(μ_b) now implies that the word

z_h g(b)^k u = z_h g(b)^{k-ℓ} h(b)

is a prefix of h(b)^+, and thus z_h g(b)^{k-ℓ} commutes with h(b) by
Lemma 7(B). □
As a consequence, we have the claim of this section.

Claim 6. If (g, h) is a shortest counterexample such that
pref_1(σ) = suff_1(σ) = b, then μ_b = b^{k-ℓ}.

Proof. Let t be the common primitive root of the words h(b) and
z_h g(b)^{k-ℓ}. Recall that, by (1), the maximal t-prefix of any h(au)z_h is
z_h. We deduce the following.

• The maximal t-prefix of h(σ)z_h = g(σ) is h(b)^ℓ z_h.

• The maximal t-prefix of h(μ_bσ)z_h = z_h g(μ_bσ) = z_h g(b)^{k-ℓ} g(b^{ℓ-k}μ_bσ)
is h(b)^k z_h, which implies that the maximal t-prefix of g(b^{ℓ-k}μ_bσ) is

(z_h g(b)^{k-ℓ})^{-1} h(b)^k z_h = h(b)^k g(b)^{ℓ-k}.

Since σ contains a, both maximal t-prefixes mentioned above are proper.

Let first h(b)^k g(b)^{ℓ-k} ≠ h(b)^ℓ z_h, and put v = σ ∧ b^{ℓ-k}μ_bσ.

If |h(b)^k g(b)^{ℓ-k}| > |h(b)^ℓ z_h|, then g(v) = h(b)^ℓ z_h, by Lemma 8, a
contradiction with Lemma 34.

On the other hand, |h(b)^k g(b)^{ℓ-k}| < |h(b)^ℓ z_h| implies k < ℓ, and
Lemma 8 yields g(v) = h(b)^k g(b)^{ℓ-k} and g(vb^{k-ℓ}) = h(b)^k, a
contradiction with Lemma 32.

It remains that h(b)^k g(b)^{ℓ-k} = h(b)^ℓ z_h, which implies
h(b)^{k-ℓ} = z_h g(b)^{k-ℓ} and μ_b = b^{k-ℓ}. □

9.3 The case: pref_1(σ) = suff_1(σ) = b and μ_b = b^{k-ℓ}

This last case is the most difficult because it in a way compresses two places
we use for the analysis into one, namely the beginning of σ and the beginning
of μ_b. Therefore, we have to employ a more detailed analysis of μ_a.

Claim 7. There is no shortest counterexample with pref_1(σ) = suff_1(σ) = b.

Proof. Claim 6 implies pref_1(μ_b) = suff_1(μ_b) = b, whence

pref_1(μ_a) = suff_1(μ_a) = a.

Let v_1 be the longest prefix of μ_a ending with a and satisfying
|z_h g(v_1)| > |h(v_1)|. It follows that μ_a = v_1 b^m v_2 where m > 0,
suff_1(v_1) = pref_1(v_2) = a, and

|z_h g(v_1)| > |h(v_1)|, |g(v_2)| > |h(v_2)|.

Denote u_1 = h(v_1)^{-1} z_h g(v_1) and u_2 = g(v_2) h(v_2)^{-1}.

From |h(b)| > |g(b)| and z_h g(b^{k-ℓ}) = h(b^{k-ℓ}) we obtain

|h(b)| ≤ |z_h g(b)|. (24)

Let now y be the prefix of g{ba) of length h(b) and let x be the word such 
that xg{bf = h{bf . From JH]) we deduce that uig{b) m - x y is a prefix of h(b) + . 
Also xg(b) 1 +e ~ 1 y is a prefix of h(b) + . Lemma ITIfE|) now implies that Ui3(6) m_1 
and xg{b) 1 are suffix comparable. Since 5 is marked and both x and u\ 
are suffix comparable with g(a), we deduce to = (. + £' . 

From Lemma 1391 we obtain that h(b) is a prefix of g(b) g(a) whence the 
word uigiby h(b) is a prefix of h(b) m Zh- Lemma I71[B|) now implies that u\g(b) 1 
commutes with h(b). 

Note that U\g(b) is the maximal ft,(6)-suffix of g(av\b ) and xg(b) e is the 
maximal /i(&)-suffix of g(o~(ib)- Minimality of o~fi a implies that 

u 1 g(b e ')^h(b i ')=xg(b e '). 

Let v be the longest common suffix of a/if, and av\b l . We apply Lemma [S] 
to g and obtain that g(v), which is the longest common suffix of g(afj,b) and 
g(av\b £ ), is equal either to u\g(b) 1 or to xg{b) 1 . In both cases, g{v) commutes 
with h(b); let t be their common primitive root. 






Since g is marked, the maximal t-suffix of g(σμ_b) is g(u) where u is the
maximal v-suffix of σμ_b. Since h is marked, the maximal t-suffix of h(σμ_b)
is h(b^{ℓ'}). Therefore g(σμ_b u^{-1}) = h(σμ_b b^{-ℓ'}) = h(σb^{-k'}), where
σμ_b u^{-1} is a prefix of σ since h(σb^{-k'}) is a prefix of h(σ). Hence (g, h)
is not a shortest counterexample by Lemma 32. □

This concludes the proof that there is no counterexample. By Lemma 27, two
minimal elements α and β of Eq(g, h) cannot start with the same letter if g and
h are both non-periodic. Clearly, also ḡ and h̄ are non-periodic and ᾱ, β̄ are
minimal elements of Eq(ḡ, h̄). Theorem 2 is proved.
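
The statement just proved can be observed on a small example. The following Python sketch enumerates, by brute force, the minimal nonempty equality words of two binary morphisms up to a length bound. The morphisms below are a hypothetical example constructed for this illustration (they are not from the paper), and "non-periodic" is taken in the usual sense that the images of the two letters do not commute.

```python
from itertools import product


def apply(morphism, word):
    """Image of `word` under a morphism given as a dict letter -> string."""
    return "".join(morphism[c] for c in word)


def minimal_equality_words(g, h, max_len):
    """Nonempty words w, |w| <= max_len, with g(w) == h(w) and with no
    shorter nonempty equality word as a prefix."""
    minimal = []
    for n in range(1, max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if apply(g, w) != apply(h, w):
                continue
            if not any(w.startswith(m) for m in minimal):
                minimal.append(w)
    return minimal


# Hypothetical non-periodic morphisms: g(ab) = h(ab) and g(ba) = h(ba).
g = {"a": "bab", "b": "aba"}
h = {"a": "babab", "b": "a"}
assert g["a"] + g["b"] != g["b"] + g["a"]   # g is non-periodic
assert h["a"] + h["b"] != h["b"] + h["a"]   # h is non-periodic

generators = minimal_equality_words(g, h, 6)

# The two minimal equality words start, and end, with different letters.
first_letters = {w[0] for w in generators}
last_letters = {w[-1] for w in generators}
assert len(first_letters) == len(generators)
assert len(last_letters) == len(generators)
```

The search is exponential in the length bound, so it only serves as a sanity check on small cases.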



10 Test set 

In this section we show that each binary language has a test set of cardinality
at most two. The result is a consequence of Theorem 1 and Theorem 2.

A test set of a language L ⊆ Σ* is defined as a subset T of L such that the
agreement of two morphisms on the language T guarantees their agreement on
L. Formally, for any two morphisms g and h defined on Σ*,

(∀u ∈ T)(g(u) = h(u)) ⇒ (∀v ∈ L)(g(v) = h(v)).
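
The definition quantifies over all pairs of morphisms, so it cannot be verified by enumeration. It can, however, be refuted on small cases. The following Python sketch searches for a pair of morphisms with short letter images that agree on T but not on L; the function names and the bound on image lengths are assumptions made for this illustration, and a positive answer only means that no counterexample exists within the bound.

```python
from itertools import product


def apply(morphism, word):
    """Image of `word` under a morphism given as a dict letter -> string."""
    return "".join(morphism[c] for c in word)


def small_morphisms(max_image_len):
    """All morphisms {a, b}* -> {a, b}* with letter images of bounded length."""
    images = ["".join(p) for n in range(max_image_len + 1)
              for p in product("ab", repeat=n)]
    return [{"a": x, "b": y} for x in images for y in images]


def passes_bounded_test(T, L, max_image_len=2):
    """Return False if some pair of small morphisms agrees on T but not on L.

    True means only that no such pair exists within the bound."""
    morphisms = small_morphisms(max_image_len)
    for g in morphisms:
        for h in morphisms:
            agree_on_T = all(apply(g, u) == apply(h, u) for u in T)
            agree_on_L = all(apply(g, v) == apply(h, v) for v in L)
            if agree_on_T and not agree_on_L:
                return False
    return True


# Agreement on "ab" forces agreement on every power of "ab".
assert passes_bounded_test({"ab"}, {"ab", "abab", "ababab"})

# Agreement on "ab" does not force agreement on "ba":
# g(a) = a, g(b) = ba and h(a) = ab, h(b) = a agree on "ab" only.
assert not passes_bounded_test({"ab"}, {"ab", "ba"})
```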

The ratio of a word u ∈ A^+ is denoted by r(u) and defined by

r(u) = |u|_a / |u|_b.

If |u|_b = 0, then r(u) = ∞. A word u is said to be ratio-primitive if no proper
prefix of u has the same ratio as u.

It is not difficult to see that each nonempty word u has a unique factorization
u = u_1 ... u_k where each u_i is a nonempty ratio-primitive word such that
r(u_i) = r(u). We call it the ratio-primitive factorization of u. Let R(L)
denote the set of all ratio-primitive words u such that u occurs in the
ratio-primitive factorization of at least one word in L.
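
The ratio-primitive factorization can be computed by cutting a word after every prefix that has the same ratio as the whole word. The following Python sketch does this, assuming the ratio compares the number of occurrences of a with the number of occurrences of b; the function names are illustrative, and ratios are compared by cross-multiplication so that the infinite ratio needs no special case.

```python
def same_ratio(x, y):
    """True if the nonempty words x and y have the same ratio |.|_a / |.|_b.

    Cross-multiplication avoids dividing by zero when a word contains no b."""
    return x.count("a") * y.count("b") == x.count("b") * y.count("a")


def is_ratio_primitive(u):
    """True if no nonempty proper prefix of u has the same ratio as u."""
    return not any(same_ratio(u[:i], u) for i in range(1, len(u)))


def ratio_primitive_factorization(u):
    """Cut the nonempty word u after every prefix with the same ratio as u."""
    factors, start = [], 0
    for end in range(1, len(u) + 1):
        if same_ratio(u[:end], u):
            factors.append(u[start:end])
            start = end
    return factors


factors = ratio_primitive_factorization("aabbab")
# Every factor is ratio-primitive and has the ratio of the whole word.
assert all(is_ratio_primitive(f) and same_ratio(f, "aabbab") for f in factors)
```

A segment between two consecutive cut points has the ratio of u exactly when the prefix ending at its right end does, which is why a single left-to-right scan suffices.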

Lemma 41. If |g(a)| ≠ |h(a)|, then |g(u)| = |h(u)|, u ∈ A^+, if and only if

r(u) = (|h(b)| - |g(b)|) / (|g(a)| - |h(a)|).

If |g(a)| = |h(a)| and |g(b)| ≠ |h(b)|, then |g(u)| = |h(u)|, u ∈ A^+, if and
only if r(u) = ∞.

Proof. Follows directly from

|g(u)| = |u|_a · |g(a)| + |u|_b · |g(b)| and |h(u)| = |u|_a · |h(a)| + |u|_b · |h(b)|.

□
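
The first part of the lemma can be checked numerically. The following Python sketch compares, for all short binary words, the condition |g(u)| = |h(u)| with the ratio condition, using exact rational arithmetic; the function names and the example morphisms are hypothetical and serve only as an illustration.

```python
from fractions import Fraction
from itertools import product


def image_length(morphism, word):
    """Length of the image of `word`, computed letter by letter."""
    return sum(len(morphism[c]) for c in word)


def critical_ratio(g, h):
    """Right-hand side of the ratio condition; requires |g(a)| != |h(a)|."""
    return Fraction(len(h["b"]) - len(g["b"]), len(g["a"]) - len(h["a"]))


def has_ratio(u, r):
    """True if |u|_a / |u|_b equals the finite ratio r (False if |u|_b = 0)."""
    b = u.count("b")
    return b != 0 and Fraction(u.count("a"), b) == r


def check(g, h, max_len):
    """Check the equivalence on all nonempty words of length <= max_len."""
    r = critical_ratio(g, h)
    for n in range(1, max_len + 1):
        for letters in product("ab", repeat=n):
            u = "".join(letters)
            equal_lengths = image_length(g, u) == image_length(h, u)
            if equal_lengths != has_ratio(u, r):
                return False
    return True


# Hypothetical example: |g(a)| = 3, |h(a)| = 5, |g(b)| = 3, |h(b)| = 1.
g = {"a": "bab", "b": "aba"}
h = {"a": "babab", "b": "a"}
assert critical_ratio(g, h) == 1
assert check(g, h, 8)
```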

An immediate corollary is the following fact.

Lemma 42. Binary morphisms g and h agree on L if and only if they agree on
R(L).

Here is one more observation. 

Lemma 43. If g(u) = h(u) and g(v) = h(v), with u, v ∈ A^+ and r(u) ≠ r(v),
then g = h.

Proof. Since r(u) ≠ r(v), the word uv contains both letters a and b. Lemma 41
implies |g(a)| = |h(a)| and |g(b)| = |h(b)|, whence g = h. □

We can now prove the main claim.
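
The length argument in this proof can be spelled out as a linear system. The following is a sketch of one way to fill in the detail, writing x = |g(a)| - |h(a)| and y = |g(b)| - |h(b)|; these two symbols are introduced here for convenience and are not used in the paper.

```latex
% Comparing lengths in g(u) = h(u) and g(v) = h(v):
\begin{aligned}
|u|_a\, x + |u|_b\, y &= 0, \\
|v|_a\, x + |v|_b\, y &= 0,
\end{aligned}
\qquad
\det\begin{pmatrix} |u|_a & |u|_b \\ |v|_a & |v|_b \end{pmatrix}
  = |u|_a\,|v|_b - |u|_b\,|v|_a \neq 0
  \iff r(u) \neq r(v).
% A homogeneous system with nonzero determinant has only the trivial solution:
x = 0, \qquad y = 0 .
```

Once the images of each letter have equal lengths under g and h, comparing g(u) = h(u) and g(v) = h(v) position by position gives g(c) = h(c) for every letter c occurring in uv, and uv contains both letters.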

Theorem 44. Let L ⊆ A* be a language. Then L possesses a test set of
cardinality at most two.

Proof. If L contains words u and v with different ratios, then T = {u, v} is a
test set of L by Lemma 43.

Suppose that all words in L have the same ratio. We first find a test set T_R
of cardinality at most two for R(L). If R(L) has cardinality at least three, let
T_R = {u, v} where u, v ∈ R(L) and pref_1(u) = pref_1(v).

Let g and h be morphisms such that g ≠ h, g(u) = h(u) and g(v) = h(v).
Since u and v are ratio-primitive, Lemma 41 implies that u and v are minimal
elements of Eq(g, h). Therefore both morphisms are periodic by Theorem 1(B)
and Theorem 2. By Theorem 1(A), we have R(L) ⊆ Eq(g, h).

Let now T be a subset of L such that R(T) = T_R. Clearly, T can be chosen
such that its cardinality is at most two. Lemma 42 concludes the proof. □